Dialogue as Collaborative Tree Growth
Authors
Abstract
This paper applies Dynamic Syntax (Kempson et al., 2001) to dialogue modelling, and provides a characterisation of production that is relative to a converse process of tree growth which constitutes a parse check. As evidence for this approach, we place it in a psycholinguistic perspective, using it to model (i) the parsing/production of elliptical expressions, and (ii) dialogue properties such as a speaker-feedback mechanism, speaker/hearer alignments of structure, and speaker/hearer role reversal in shared utterances (Pickering and Garrod, forthcoming).
In current research, production and parsing are very generally taken to be independent applications of a neutral, centrally available grammar formalism. In abandoning this assumption and articulating a parsing-directed grammar formalism, the Dynamic Syntax framework (Kempson et al., 2001) faces the challenge of articulating the relationship between them. In this paper, we explore the formal mechanisms needed to model generation (as an idealised form of production) given the parsing-directed assumptions of Dynamic Syntax. We then show how the closeness of correlation posited between parsing and production mechanisms allows a natural extension to dialogue modelling, one that meets the challenge set by Pickering and Garrod (forthcoming) that grammar formalisms be evaluated relative to their success in capturing key properties of dialogue.
1 Background Assumptions
Dynamic Syntax is a model of NL understanding in which parsing is defined as the progressive projection of a decorated tree structure following the left-right sequence of words in the string. Logical forms are represented as decorated trees, whose topnode is decorated with a formula Fo(α) of type t, and whose dominated nodes are decorated with subterms of the formula α.
The central concept is that of goal-directed tree growth, defined using the modal tree logic LOFT (Blackburn and Meyer-Viol, 1994), with basic operators 〈↓〉 and 〈↑〉 for the daughter and mother relations respectively. Tree growth is defined over partial trees, each sequence of such trees starting from the requirement ?Ty(t) at a rootnode, which constitutes the overall requirement to establish a logical form of type t at Tn(0) (Tn for treenode). To this node, additional requirements such as ?〈↓〉Ty(e) and ?〈↓〉Ty(e → t) are added as subgoals, and lead to the introduction of daughter nodes with requirements ?Ty(e) and ?Ty(e → t). Tree nodes are thus created from the root downwards with imposed requirements, which are subsequently met by tree growth actions dictated by incoming lexical items as they are parsed in a left-right sequence. Processes such as function application are defined in tandem with type-deduction to decorate nonterminal nodes as pairs of terminal nodes are successfully decorated. (At each stage, there is one itemised node under development, indicated by a pointer, ♦.) A complete tree is one which projects a logical form (with a formula of type t decorating its top node). All update processes are monotonic, progressively developing a tree structure meeting the requirements which are imposed on nodes as they are introduced.
[Proceedings of EDILOG 2002, p. 70]
In addition to the concept of requirement, there are other concepts of structural underspecification. Formula values may be underspecified, e.g. for anaphoric expressions, which project formulae of the form Fo(U), with U a metavariable. Underspecified tree relations may also be introduced (for left-dislocated expressions), with a treenode identified as dominated by some treenode a without, at that point in the tree growth process, any fixed extension (its treenode is described as 〈↑∗〉Tn(a), with a requirement for a fixed extension, ?∃x.Tn(x)).
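The requirement-driven growth described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Node`, `introduce_daughters`, `complete` and `outstanding` are hypothetical, types are encoded as plain strings, the pointer is omitted, and the address convention (argument daughter `0`, functor daughter `1`) is an assumption made for the illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Union

# A formula is either a term (string) or a functor (callable); illustrative encoding only.
Formula = Union[str, Callable[[str], str]]


@dataclass
class Node:
    address: str                        # treenode address, e.g. "0" for Tn(0)
    requirement: Optional[str] = None   # outstanding ?Ty(...) requirement, if any
    ty: Optional[str] = None            # established type, Ty(...)
    fo: Optional[Formula] = None        # established formula, Fo(...)
    daughters: Dict[str, "Node"] = field(default_factory=dict)

    def decorate(self, ty: str, fo: Formula) -> None:
        """Monotonic update: satisfy the node's requirement, never retract it."""
        if self.requirement is not None and self.requirement != ty:
            raise ValueError(
                f"Tn({self.address}) requires ?Ty({self.requirement}), got Ty({ty})"
            )
        self.ty, self.fo, self.requirement = ty, fo, None


def introduce_daughters(node: Node) -> None:
    """Add subgoals ?Ty(e) and ?Ty(e->t) as daughters of a node requiring ?Ty(t)."""
    if node.requirement != "t":
        raise ValueError("this sketch only introduces daughters under ?Ty(t)")
    node.daughters["arg"] = Node(node.address + "0", requirement="e")
    node.daughters["fun"] = Node(node.address + "1", requirement="e->t")


def complete(node: Node) -> None:
    """Function application plus type deduction on a pair of decorated daughters."""
    arg, fun = node.daughters["arg"], node.daughters["fun"]
    if arg.requirement is not None or fun.requirement is not None:
        raise ValueError("both daughters must be decorated first")
    prefix = arg.ty + "->"
    if not fun.ty.startswith(prefix):
        raise ValueError("functor type does not accept this argument type")
    node.decorate(fun.ty[len(prefix):], fun.fo(arg.fo))


def outstanding(node: Node) -> List[str]:
    """Addresses of all nodes that still carry a requirement."""
    found = [node.address] if node.requirement is not None else []
    for daughter in node.daughters.values():
        found += outstanding(daughter)
    return found


# Growing a tree for a simple intransitive clause, root first.
root = Node("0", requirement="t")
introduce_daughters(root)
after_intro = outstanding(root)          # all three nodes still have requirements
root.daughters["arg"].decorate("e", "John")
root.daughters["fun"].decorate("e->t", lambda x: f"Smoke({x})")
complete(root)
```

In this sketch a tree counts as complete when `outstanding(root)` is empty and the root carries a formula of type `t`; metavariables and unfixed nodes are not modelled.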
These various forms of underspecification interact in the process of progressive satisfaction of all imposed requirements through computational, lexical and pragmatic actions,[1] each of which constitutes a monotonic step of tree growth (Ty(e) is a development of ?Ty(e), Fo(Mary) is a development of Fo(U), 〈↑〉Tn(a) is an update of 〈↑∗〉Tn(a), and so on). Wellformedness is defined in terms of the result of such actions: a sentence is wellformed if and only if at least one completed tree structure can be derived from a sequence of words, with no requirements outstanding.
[1] A pragmatic process of substitution is presumed to provide the update for the metavariables lexically projected by anaphoric expressions. Defined as an architecture within which steps of parsing can be articulated, the framework has nothing to say about the pragmatic constraints that determine actual choices.
In using a tree description language, the system is like other parsing formalisms using descriptions of partial trees (Duchier and Gardent, 2001; Joshi and Kallmeyer, forthcoming; Koller et al., 2000), or descriptions of trees (Sturt and Crocker, 1996). But unlike them, partial trees are the basis for the grammar formalism, and not merely for semantic characterisations or parsing algorithms relative to an independently defined syntax. The entire concept of syntax is founded in this concept of growth of semantic representations along a left-to-right dimension, without any intermediate and independent syntactic level of representation (Kempson et al., 2001).
Quantification in this system is expressed using the epsilon calculus (Meyer-Viol, 1995), with variable-binding term operators of type e in the style of arbitrary names of natural deduction systems for predicate logic. A sentence such as A man smokes is taken to project a logical form Smoke(ε, x, Man(x)), derived, like other aspects of the interpretation process, through incremental construction from lexical specifications which only partially determine the resulting formula. As in Copestake et al. (1999), scope is defined through an incrementally collected set of scope constraints allowing lexical variation, and these scope constraints jointly determine the evaluation of the constructed logical form. The form Fo(Smoke(ε, x, Man(x))), for example, is evaluated relative to a scope statement S < x (S a variable representing the index of evaluation) as Fo(S : Man(a) ∧ Smoke(a)), where a = (ε, x, Man(x) ∧ Smoke(x)). Underspecification of scope determination is thus not expressed through underspecified tree relations (as in Erk et al., forthcoming; Joshi and Kallmeyer, forthcoming) but through metavariables in scope statements projected, for example, by indefinites. Like all other aspects of underspecification, these must be resolved during the construction process.
2 Dynamic Syntax and Production
Assuming this model of parsing, we wish to articulate the relation between parsing and an idealised production model.[2] What we aim to provide is a “tactical” generation system which takes a source tree as input and incrementally “checks” off nodes in this tree as a progressively enriched partial tree can be successfully induced by some selected word. The pointer in this partial parse tree picks out the node whose analogue in the source tree is being “checked”, an action which is matched by the selection of some word from the lexicon and “writing” it at the righthand edge of a sequence of already established words. Such a checking action is licensed if and only if the word selected projects a compound parse action which leads to an update of the parse tree from that defined over the words already decided upon, mapping that partial tree onto an update reflecting the annotations on the node being checked.
[2] This paper does not address the problem of phonological/phonetic levels of realisation and their relationship to parsability of the output.
[Kempson & Otsuka / Dialogue as Collaborative Tree Growth, p. 71]
To see the general dynamics, consider a simple tree representing the content of John saw a woman, with a formula Fo(S_PAST : See(ε, x, Woman(x))(John)) decorating its rootnode and the accompanying scope statement S_PAST < x.[3] We assume initially that the starting point for any production task is a full tree as source, representing the interpretation of the string, and a parse tree made up of a single rootnode with pointer and requirement ?Ty(t) (though see section 5, where the inputs to both production and parsing are generalised to arbitrary partial trees). Generation steps are then licensed relative to some associated parsing step. Within the source tree, the generating system can, for example, “check off” the subject node and “generate” the first word in the string, because there is a parsing routine whereby a combination of node-introducing rules operating on the rootnode, together with the lexical actions of the name John, can lead to the successful annotation of the subject node as Fo(John), Ty(e). The search through the lexicon, then, is to find the word which provides these actions. In this paper, this aspect of the search is trivialised by making the Fodorian assumption that words and concepts correspond one-to-one. However, the selection of a pronoun is nontrivial, and is licensed as long as the underspecified input provided is sufficient for the hearer to identify the term intended, given the context provided.[4] Having checked off the node associated with the subject in the source tree, in virtue of a successful step in the parse tree annotating the subject node in that tree, the next production step again follows the move of the parsing pointer, which is back to the rootnode of the parse tree.
[3] We take S_PAST as a shorthand representation of tense, S_PAST being the representation of the index relative to which the logical form Fo(See(ε, x, Woman(x))(John)) is evaluated.
[4] In a fuller account of proper names, these too would require nontrivial identification of the individual being talked about, with a lexical characterisation involving a metavariable requiring substitution. However, we leave all such issues aside.
From there, a following computational step allows the introduction of a predicate-requiring node. Accordingly, the generator attempts to check the content of its predicate node in the source tree. With verbs defined as being of the form IF ?Ty(e → t), THEN..., the lexicon is scanned for a word which will lead to the introduction and decoration of a node with Fo(See), Ty(e → (e → t)). The word saw is duly selected, which in addition enables the information about the tense construal to be checked; it is written at the right edge of the sequence, yielding John saw. Given that the actions of the word saw leave the pointer in the parse tree at the object node, that node alone remains to be realised by some sequence of words. The term Fo(ε, x, Woman(x)) can then be checked off, because there is a sub-sequence of actions provided by the determiner and noun that introduce a subtree decorated with Fo(λP.(ε, P)) and Fo(x, Woman(x)) for some fresh variable x, combining together to yield Fo(ε, x, Woman(x)). Once the hearer can be presumed to have a tree with all terminal nodes annotated, the nonterminal nodes then have to be taken as checked, given the automatic parse steps of functional-application/type-deduction that would apply to decorate nonterminal nodes in the parsing routine. The pattern is quite general.
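The walk-through above can be sketched in Python as generation by parse check: a word is written only if parsing it at the pointed node yields the decoration found on the matching source-tree node. This is a rough illustration under strong assumptions, not the authors' system. The names `LEXICON`, `POINTER_PATH`, `SOURCE`, `parse_word` and `generate` are hypothetical, the node addresses and the types `cn` and `cn->e` for noun and determiner are assumptions, tense is ignored, the pointer path is fixed in advance rather than derived by computational rules, and the verb's actions are flattened so that the pointer sits directly at the node the verb decorates.

```python
from typing import Dict, List, Optional, Tuple

Decoration = Tuple[str, str]  # (Ty, Fo) pair, both as plain strings

# One word per concept, following the one-to-one assumption made in the text.
LEXICON: Dict[str, Decoration] = {
    "John": ("e", "John"),
    "Mary": ("e", "Mary"),
    "saw": ("e->(e->t)", "See"),
    "a": ("cn->e", "λP.(ε, P)"),
    "woman": ("cn", "(x, Woman(x))"),
    "man": ("cn", "(x, Man(x))"),
}

# Hypothetical order in which the parsing pointer visits terminal nodes,
# paired with the type requirement found at each node.
POINTER_PATH: List[Tuple[str, str]] = [
    ("00", "e"),            # subject
    ("011", "e->(e->t)"),   # verb
    ("0101", "cn->e"),      # determiner
    ("0100", "cn"),         # noun
]

# Terminal decorations of the source tree for "John saw a woman".
SOURCE: Dict[str, Decoration] = {
    "00": ("e", "John"),
    "011": ("e->(e->t)", "See"),
    "0101": ("cn->e", "λP.(ε, P)"),
    "0100": ("cn", "(x, Woman(x))"),
}


def parse_word(
    parse_tree: Dict[str, Decoration],
    address: str,
    requirement: str,
    word: str,
    lexicon: Dict[str, Decoration],
) -> Optional[Dict[str, Decoration]]:
    """Parsing step: apply the word's lexical action at the pointed node.

    Returns an updated copy of the parse tree, or None if the word's
    trigger does not match the requirement at that node.
    """
    ty, fo = lexicon[word]
    if ty != requirement:
        return None
    updated = dict(parse_tree)
    updated[address] = (ty, fo)
    return updated


def generate(
    source: Dict[str, Decoration],
    lexicon: Dict[str, Decoration] = LEXICON,
) -> List[str]:
    """Write a word only when its parse update matches the source node."""
    parse_tree: Dict[str, Decoration] = {}
    words: List[str] = []
    for address, requirement in POINTER_PATH:
        for word in lexicon:
            updated = parse_word(parse_tree, address, requirement, word, lexicon)
            if updated is not None and updated[address] == source[address]:
                parse_tree = updated      # the hearer's presumed tree grows
                words.append(word)        # word written at the right edge
                break
        else:
            raise LookupError(f"no word licenses node {address}")
    return words


sentence = generate(SOURCE)
```

The point of the design is that `generate` has no grammar of its own: every production step is licensed solely by a successful call to the parsing step, mirroring the parse-check idea in the text.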
Generation involves three trees: first, a fully annotated tree which forms the input to the process; second, a tree with a subset of its nodes checked; and third, a tree which reflects the corresponding parse tree commensurate with establishing the node currently being checked. The sequence of actions given above is by no means unique. Any sequence of actions conforming to the pattern of pairing the source tree incrementally with the emergent parse tree following the sequence of words is licensed: e.g. left-peripheral placement of any “dislocated” NP, presuming on a parse sequence projecting an initially unfixed node (see section 4 below, and Kempson et al., 2001).
3 Ellipsis as Tree Abstraction
Without independent motivation, such a checking process would be nothing more than stipulation, and an indication of the need for a separate production “grammar”. The challenge is to reduce this checking system to some operation independently needed for natural language processing. In the interpretation of ellipsis, we find the mechanism we need, this being a tree-abstraction process enabling partial structures to be re-used. For example, consider the parsing of the elliptical fragment in (1), which can be interpreted either as “Harry saw Bill” or as “John saw Harry”, the choice being free and determined pragmatically:
(1) John saw Bill. Harry too.
For this fragment, a subtree needs to be constructed from the tree established from the first sentence as the basis for interpreting the fragment Harry. This subtree is (1a) or (1b):